Overview

Dataset statistics

Number of variables: 11
Number of observations: 462
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.8 KiB
Average record size in memory: 88.3 B

Variable types

Numeric: 9
Categorical: 2

Alerts

adiposity is highly correlated with obesity and 1 other field (High correlation)
obesity is highly correlated with adiposity (High correlation)
age is highly correlated with tobacco and 1 other field (High correlation)
tobacco is highly correlated with age (High correlation)
names is uniformly distributed (Uniform)
names has unique values (Unique)
tobacco has 107 (23.2%) zeros (Zeros)
alcohol has 110 (23.8%) zeros (Zeros)

Reproduction

Analysis started: 2022-11-01 20:24:22.426650
Analysis finished: 2022-11-01 20:24:40.103651
Duration: 17.68 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

names
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 462
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 231.9350649
Minimum: 1
Maximum: 463
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 24.05
Q1: 116.25
median: 231.5
Q3: 347.75
95-th percentile: 439.95
Maximum: 463
Range: 462
Interquartile range (IQR): 231.5

Descriptive statistics

Standard deviation: 133.9385851
Coefficient of variation (CV): 0.5774831207
Kurtosis: -1.203538107
Mean: 231.9350649
Median Absolute Deviation (MAD): 116
Skewness: 0.001436279421
Sum: 107154
Variance: 17939.54458
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   1      0.2%
319                 1      0.2%
317                 1      0.2%
316                 1      0.2%
315                 1      0.2%
314                 1      0.2%
313                 1      0.2%
312                 1      0.2%
311                 1      0.2%
310                 1      0.2%
Other values (452)  452    97.8%

Minimum 10 values
Value  Count  Frequency (%)
1      1      0.2%
2      1      0.2%
3      1      0.2%
4      1      0.2%
5      1      0.2%
6      1      0.2%
7      1      0.2%
8      1      0.2%
9      1      0.2%
10     1      0.2%

Maximum 10 values
Value  Count  Frequency (%)
463    1      0.2%
462    1      0.2%
461    1      0.2%
460    1      0.2%
459    1      0.2%
458    1      0.2%
457    1      0.2%
456    1      0.2%
455    1      0.2%
454    1      0.2%
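
Several of the statistics reported for names are derived from one another, so they can be cross-checked using only the numbers in this report. The sketch below is illustrative and is not how pandas-profiling computes them; the dictionary `reported` and its key names are hypothetical and only hold values copied from the names statistics.

```python
import math

# Values copied from the names statistics in this report
# (the dictionary and its keys are illustrative, not a pandas-profiling API).
reported = {
    "n": 462,
    "sum": 107154,
    "mean": 231.9350649,
    "std": 133.9385851,
    "variance": 17939.54458,
    "minimum": 1,
    "maximum": 463,
    "q1": 116.25,
    "q3": 347.75,
}

# Coefficient of variation: standard deviation relative to the mean.
cv = reported["std"] / reported["mean"]
# Interquartile range: distance between the third and first quartiles.
iqr = reported["q3"] - reported["q1"]
# Range: distance between the largest and smallest value.
value_range = reported["maximum"] - reported["minimum"]
# The mean is the sum divided by the number of observations.
mean_from_sum = reported["sum"] / reported["n"]
# The standard deviation is the square root of the variance.
std_from_variance = math.sqrt(reported["variance"])

print(f"CV    = {cv:.7f}")
print(f"IQR   = {iqr}")
print(f"Range = {value_range}")
print(f"Mean  = {mean_from_sum:.7f}")
print(f"Std   = {std_from_variance:.7f}")
```

The same relations hold for every numeric variable in the report, so the check can be repeated by swapping in another variable's values.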

sbp
Real number (ℝ≥0)

Distinct: 62
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.532204944 × 10⁻⁵
Minimum: 2.104199983 × 10⁻⁵
Maximum: 9.802960494 × 10⁻⁵
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 2.104199983 × 10⁻⁵
5-th percentile: 3.228305785 × 10⁻⁵
Q1: 4.565376187 × 10⁻⁵
median: 5.56916908 × 10⁻⁵
Q3: 6.50364204 × 10⁻⁵
95-th percentile: 7.971938776 × 10⁻⁵
Maximum: 9.802960494 × 10⁻⁵
Range: 7.698760511 × 10⁻⁵
Interquartile range (IQR): 1.938265853 × 10⁻⁵

Descriptive statistics

Standard deviation: 1.430315091 × 10⁻⁵
Coefficient of variation (CV): 0.2585434027
Kurtosis: -0.1226353987
Mean: 5.532204944 × 10⁻⁵
Median Absolute Deviation (MAD): 9.344729596 × 10⁻⁶
Skewness: 0.063303759
Sum: 0.02555878684
Variance: 2.045801259 × 10⁻¹⁰
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
5.406574394 × 10⁻⁵  29     6.3%
5.56916908 × 10⁻⁵   29     6.3%
6.103515625 × 10⁻⁵  25     5.4%
5.739210285 × 10⁻⁵  24     5.2%
7.181844298 × 10⁻⁵  21     4.5%
6.50364204 × 10⁻⁵   21     4.5%
5.917159763 × 10⁻⁵  20     4.3%
6.298815823 × 10⁻⁵  20     4.3%
5.25099769 × 10⁻⁵   18     3.9%
6.718624026 × 10⁻⁵  17     3.7%
Other values (52)   238    51.5%

Minimum 10 values
Value               Count  Frequency (%)
2.104199983 × 10⁻⁵  1      0.2%
2.143347051 × 10⁻⁵  1      0.2%
2.183596821 × 10⁻⁵  1      0.2%
2.311390533 × 10⁻⁵  3      0.6%
2.356489773 × 10⁻⁵  2      0.4%
2.5 × 10⁻⁵          1      0.2%
2.550760127 × 10⁻⁵  1      0.2%
2.657030503 × 10⁻⁵  2      0.4%
2.770083102 × 10⁻⁵  2      0.4%
2.829334541 × 10⁻⁵  1      0.2%

Maximum 10 values
Value               Count  Frequency (%)
9.802960494 × 10⁻⁵  1      0.2%
9.611687812 × 10⁻⁵  1      0.2%
9.425959091 × 10⁻⁵  1      0.2%
8.8999644 × 10⁻⁵    3      0.6%
8.573388203 × 10⁻⁵  7      1.5%
8.416799933 × 10⁻⁵  1      0.2%
8.26446281 × 10⁻⁵   4      0.9%
7.971938776 × 10⁻⁵  7      1.5%
7.694675285 × 10⁻⁵  12     2.6%
7.431629013 × 10⁻⁵  8      1.7%

tobacco
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 214
Distinct (%): 46.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.260969685
Minimum: 0
Maximum: 3.959695934
Zeros: 107
Zeros (%): 23.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.3074151682
median: 1.319507911
Q3: 1.977630156
95-th percentile: 2.745518255
Maximum: 3.959695934
Range: 3.959695934
Interquartile range (IQR): 1.670214988

Descriptive statistics

Standard deviation: 0.9426485881
Coefficient of variation (CV): 0.7475584857
Kurtosis: -0.8942315863
Mean: 1.260969685
Median Absolute Deviation (MAD): 0.7281646003
Skewness: 0.1347283054
Sum: 582.5679944
Variance: 0.8885863606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   107    23.2%
2.047672511         11     2.4%
1.551845574         10     2.2%
0.6931448432        8      1.7%
1.741101127         8      1.7%
1.825093026         7      1.5%
1.775414311         7      1.5%
2.701920077         5      1.1%
0.8151931096        5      1.1%
1.319507911         5      1.1%
Other values (204)  289    62.6%

Minimum 10 values
Value         Count  Frequency (%)
0             107    23.2%
0.1584893192  1      0.2%
0.2091279105  1      0.2%
0.2459509486  1      0.2%
0.2759459323  2      0.4%
0.3017088168  4      0.9%
0.3245342223  1      0.2%
0.3451749066  1      0.2%
0.3641128406  2      0.4%
0.381677891   1      0.2%

Maximum 10 values
Value        Count  Frequency (%)
3.959695934  1      0.2%
3.759241489  1      0.2%
3.624478073  1      0.2%
3.314454017  2      0.4%
3.287777572  1      0.2%
3.277689744  1      0.2%
3.260772438  1      0.2%
3.191747708  1      0.2%
3.177671523  1      0.2%
3.031433133  1      0.2%

ldl
Real number (ℝ≥0)

Distinct: 329
Distinct (%): 71.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.159086711
Minimum: 0.9979817686
Maximum: 1.313875503
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0.9979817686
5-th percentile: 1.081762775
Q1: 1.126212749
median: 1.158107763
Q3: 1.191976466
95-th percentile: 1.237224221
Maximum: 1.313875503
Range: 0.3158937347
Interquartile range (IQR): 0.06576371711

Descriptive statistics

Standard deviation: 0.04911563628
Coefficient of variation (CV): 0.04237442792
Kurtosis: 0.1886351067
Mean: 1.159086711
Median Absolute Deviation (MAD): 0.03285621182
Skewness: 0.02874546156
Sum: 535.4980604
Variance: 0.002412345727
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1.158905819         5      1.1%
1.147254341         5      1.1%
1.135708357         5      1.1%
1.136026083         4      0.9%
1.091493426         4      0.9%
1.153212478         4      0.9%
1.12681182          4      0.9%
1.188641364         3      0.6%
1.123349763         3      0.6%
1.122292153         3      0.6%
Other values (319)  422    91.3%

Minimum 10 values
Value         Count  Frequency (%)
0.9979817686  1      0.2%
1.006788805   1      0.2%
1.036414794   1      0.2%
1.044800014   1      0.2%
1.047465463   1      0.2%
1.055114548   1      0.2%
1.055729956   1      0.2%
1.056951172   1      0.2%
1.058759516   1      0.2%
1.060540482   1      0.2%

Maximum 10 values
Value        Count  Frequency (%)
1.313875503  1      0.2%
1.303485863  1      0.2%
1.286507018  1      0.2%
1.28090873   1      0.2%
1.277859841  1      0.2%
1.275641279  1      0.2%
1.274631487  1      0.2%
1.272932331  1      0.2%
1.266043322  1      0.2%
1.265443727  1      0.2%

adiposity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 408
Distinct (%): 88.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.4067316
Minimum: 6.74
Maximum: 42.49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 6.74
5-th percentile: 12.0065
Q1: 19.775
median: 26.115
Q3: 31.2275
95-th percentile: 37.1165
Maximum: 42.49
Range: 35.75
Interquartile range (IQR): 11.4525

Descriptive statistics

Standard deviation: 7.780698596
Coefficient of variation (CV): 0.306245554
Kurtosis: -0.6984386244
Mean: 25.4067316
Median Absolute Deviation (MAD): 5.7
Skewness: -0.2146459286
Sum: 11737.91
Variance: 60.53927064
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
30.79               3      0.6%
27.55               3      0.6%
21.1                3      0.6%
29.3                3      0.6%
24.65               2      0.4%
29.18               2      0.4%
30.9                2      0.4%
30.11               2      0.4%
26.08               2      0.4%
32.03               2      0.4%
Other values (398)  438    94.8%

Minimum 10 values
Value  Count  Frequency (%)
6.74   1      0.2%
7.12   1      0.2%
8.66   1      0.2%
9.28   1      0.2%
9.37   1      0.2%
9.39   1      0.2%
9.64   1      0.2%
9.69   2      0.4%
9.74   1      0.2%
10.05  1      0.2%

Maximum 10 values
Value  Count  Frequency (%)
42.49  1      0.2%
42.17  1      0.2%
42.06  1      0.2%
41.05  1      0.2%
40.6   1      0.2%
39.97  1      0.2%
39.71  1      0.2%
39.68  1      0.2%
39.66  1      0.2%
39.64  1      0.2%

famhist
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB
Absent: 270
Present: 192

Length

Max length: 7
Median length: 6
Mean length: 6.415584416
Min length: 6

Characters and Unicode

Total characters: 2964
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Present
2nd row: Absent
3rd row: Present
4th row: Present
5th row: Present

Common Values

Value    Count  Frequency (%)
Absent   270    58.4%
Present  192    41.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count  Frequency (%)
absent   270    58.4%
present  192    41.6%

Most occurring characters

Value  Count  Frequency (%)
e      654    22.1%
s      462    15.6%
n      462    15.6%
t      462    15.6%
A      270    9.1%
b      270    9.1%
P      192    6.5%
r      192    6.5%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  2502   84.4%
Uppercase Letter  462    15.6%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
e      654    26.1%
s      462    18.5%
n      462    18.5%
t      462    18.5%
b      270    10.8%
r      192    7.7%
Uppercase Letter
Value  Count  Frequency (%)
A      270    58.4%
P      192    41.6%

Most occurring scripts

Value  Count  Frequency (%)
Latin  2964   100.0%

Most frequent character per script

Latin
Value  Count  Frequency (%)
e      654    22.1%
s      462    15.6%
n      462    15.6%
t      462    15.6%
A      270    9.1%
b      270    9.1%
P      192    6.5%
r      192    6.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2964   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
e      654    22.1%
s      462    15.6%
n      462    15.6%
t      462    15.6%
A      270    9.1%
b      270    9.1%
P      192    6.5%
r      192    6.5%
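
The length and character statistics reported for famhist follow from the two category counts alone. As a rough illustration (not pandas-profiling's own code), the sketch below rebuilds the total character count, the mean length, and the per-character frequencies from the counts in the Common Values table; the variable names are hypothetical.

```python
from collections import Counter

# Category counts copied from the famhist "Common Values" table.
category_counts = {"Absent": 270, "Present": 192}

n_rows = sum(category_counts.values())
# Every row contributes the length of its label to the character total.
total_chars = sum(len(label) * count for label, count in category_counts.items())
mean_length = total_chars / n_rows

# Each character of a label occurs once per row carrying that label.
char_counts = Counter()
for label, count in category_counts.items():
    for ch in label:
        char_counts[ch] += count

print(f"rows={n_rows} total characters={total_chars} mean length={mean_length:.9f}")
for ch, count in char_counts.most_common():
    print(f"{ch}  {count}  {100 * count / total_chars:.1f}%")
```

The letter e leads the table because "Present" contains it twice and "Absent" once, giving 2 × 192 + 270 occurrences.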

typea
Real number (ℝ≥0)

Distinct: 54
Distinct (%): 11.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.1038961
Minimum: 13
Maximum: 78
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 13
5-th percentile: 36
Q1: 47
median: 53
Q3: 60
95-th percentile: 69
Maximum: 78
Range: 65
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.817534116
Coefficient of variation (CV): 0.1848740834
Kurtosis: 0.4704023399
Mean: 53.1038961
Median Absolute Deviation (MAD): 6
Skewness: -0.3464377547
Sum: 24534
Variance: 96.38397611
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
52                 25     5.4%
57                 23     5.0%
54                 21     4.5%
50                 21     4.5%
49                 20     4.3%
60                 18     3.9%
56                 18     3.9%
55                 17     3.7%
61                 17     3.7%
47                 17     3.7%
Other values (44)  265    57.4%

Minimum 10 values
Value  Count  Frequency (%)
13     1      0.2%
20     1      0.2%
25     1      0.2%
26     1      0.2%
28     1      0.2%
29     1      0.2%
30     2      0.4%
31     2      0.4%
32     1      0.2%
33     4      0.9%

Maximum 10 values
Value  Count  Frequency (%)
78     1      0.2%
77     1      0.2%
75     1      0.2%
74     2      0.4%
73     2      0.4%
72     4      0.9%
71     2      0.4%
70     5      1.1%
69     7      1.5%
68     6      1.3%

obesity
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 400
Distinct (%): 86.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2733598884
Minimum: 0.2151395254
Maximum: 0.3412503191
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0.2151395254
5-th percentile: 0.2465298384
Q1: 0.2618649047
median: 0.2724698541
Q3: 0.285379676
95-th percentile: 0.3006890692
Maximum: 0.3412503191
Range: 0.1261107936
Interquartile range (IQR): 0.02351477135

Descriptive statistics

Standard deviation: 0.01707294739
Coefficient of variation (CV): 0.06245593487
Kurtosis: 0.5194706974
Mean: 0.2733598884
Median Absolute Deviation (MAD): 0.01136377502
Skewness: -0.0002731432448
Sum: 126.2922684
Variance: 0.0002914855325
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0.2765664851        4      0.9%
0.2712753724        4      0.9%
0.2877728395        3      0.6%
0.290740378         3      0.6%
0.2760342845        3      0.6%
0.2903701594        3      0.6%
0.2664394851        3      0.6%
0.2622241365        3      0.6%
0.2716923988        3      0.6%
0.2873647592        3      0.6%
Other values (390)  430    93.1%

Minimum 10 values
Value         Count  Frequency (%)
0.2151395254  1      0.2%
0.216749204   1      0.2%
0.2247479836  1      0.2%
0.2278797018  1      0.2%
0.2314553958  1      0.2%
0.2341086104  1      0.2%
0.2348577613  1      0.2%
0.2352860241  1      0.2%
0.237286666   1      0.2%
0.2383360355  1      0.2%

Maximum 10 values
Value         Count  Frequency (%)
0.3412503191  1      0.2%
0.3164613359  1      0.2%
0.3160344542  1      0.2%
0.3154684005  1      0.2%
0.3122129618  1      0.2%
0.3115353414  1      0.2%
0.3112657305  1      0.2%
0.3095989586  1      0.2%
0.3069958387  1      0.2%
0.3060392129  1      0.2%

alcohol
Real number (ℝ≥0)

ZEROS

Distinct: 249
Distinct (%): 53.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.263569291
Minimum: 0
Maximum: 7.364636603
Zeros: 110
Zeros (%): 23.8%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.7638851554
median: 2.239993375
Q3: 3.558795126
95-th percentile: 5.370803223
Maximum: 7.364636603
Range: 7.364636603
Interquartile range (IQR): 2.794909971

Descriptive statistics

Standard deviation: 1.799918389
Coefficient of variation (CV): 0.795168231
Kurtosis: -0.6164388224
Mean: 2.263569291
Median Absolute Deviation (MAD): 1.379176308
Skewness: 0.4009449211
Sum: 1045.769012
Variance: 3.239706205
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   110    23.8%
1.335201735         16     3.5%
0.7638851554        8      1.7%
2.906330482         5      1.1%
4.510176095         5      1.1%
2.619905482         5      1.1%
2.323592329         5      1.1%
2.334844707         5      1.1%
1.760097511         4      0.9%
1.707536478         4      0.9%
Other values (239)  295    63.9%

Minimum 10 values
Value         Count  Frequency (%)
0             110    23.8%
0.5146375139  1      0.2%
0.5834307824  1      0.2%
0.6718629455  2      0.4%
0.7638851554  8      1.7%
0.8151931096  1      0.2%
0.8570448806  2      0.4%
0.8620642522  1      0.2%
0.8865284716  2      0.4%
0.9414545972  1      0.2%

Maximum 10 values
Value        Count  Frequency (%)
7.364636603  1      0.2%
7.326461799  1      0.2%
7.300372103  1      0.2%
6.787595021  1      0.2%
6.549994511  1      0.2%
6.506830627  1      0.2%
6.317641959  1      0.2%
6.238303588  1      0.2%
6.119020535  1      0.2%
6.074113131  1      0.2%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 49
Distinct (%): 10.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.81601732
Minimum: 15
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 15
5-th percentile: 17
Q1: 31
median: 45
Q3: 55
95-th percentile: 62
Maximum: 64
Range: 49
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.60895644
Coefficient of variation (CV): 0.3412030675
Kurtosis: -1.01622901
Mean: 42.81601732
Median Absolute Deviation (MAD): 12
Skewness: -0.3817342585
Sum: 19781
Variance: 213.4216084
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)
Common values
Value              Count  Frequency (%)
16                 20     4.3%
58                 17     3.7%
17                 17     3.7%
61                 16     3.5%
59                 16     3.5%
55                 16     3.5%
60                 15     3.2%
45                 14     3.0%
53                 14     3.0%
49                 14     3.0%
Other values (39)  303    65.6%

Minimum 10 values
Value  Count  Frequency (%)
15     3      0.6%
16     20     4.3%
17     17     3.7%
18     8      1.7%
19     2      0.4%
20     6      1.3%
21     3      0.6%
23     2      0.4%
24     6      1.3%
25     4      0.9%

Maximum 10 values
Value  Count  Frequency (%)
64     13     2.8%
63     8      1.7%
62     12     2.6%
61     16     3.5%
60     15     3.2%
59     16     3.5%
58     17     3.7%
57     8      1.7%
56     9      1.9%
55     16     3.5%

chd
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB
0: 302
1: 160

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 462
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
0      302    65.4%
1      160    34.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count  Frequency (%)
0      302    65.4%
1      160    34.6%

Most occurring characters

Value  Count  Frequency (%)
0      302    65.4%
1      160    34.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  462    100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      302    65.4%
1      160    34.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  462    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      302    65.4%
1      160    34.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  462    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      302    65.4%
1      160    34.6%

Interactions

[81 pairwise scatter plots (one for each ordered pair of the 9 numeric variables); figures not reproduced.]

Correlations

Auto

The auto setting is an easily interpretable pairwise column metric that picks a method from the types of the two variables: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (with the numerical column discretized), and numerical-numerical pairs use Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
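
The type-to-method mapping can be written as a small dispatch function. This is a minimal sketch of the mapping as described, not pandas-profiling's internal code; the function `auto_method`, the type labels, and the `column_types` dictionary are all hypothetical names chosen for illustration.

```python
NUMERICAL = "numerical"
CATEGORICAL = "categorical"


def auto_method(type_a, type_b):
    """Return (method, needs_discretization) for a pair of variable types."""
    kinds = {type_a, type_b}
    if not kinds <= {NUMERICAL, CATEGORICAL}:
        raise ValueError(f"unsupported variable types: {type_a!r}, {type_b!r}")
    if kinds == {NUMERICAL}:
        return ("Spearman's rho", False)
    # categorical-categorical or numerical-categorical: Cramér's V,
    # with the numerical column discretized when one is present.
    return ("Cramer's V", NUMERICAL in kinds)


# Variable types as listed in this report (illustrative subset).
column_types = {
    "age": NUMERICAL,
    "tobacco": NUMERICAL,
    "famhist": CATEGORICAL,
    "chd": CATEGORICAL,
}

for a, b in [("age", "tobacco"), ("famhist", "chd"), ("age", "chd")]:
    method, discretize = auto_method(column_types[a], column_types[b])
    note = " (numerical column discretized)" if discretize else ""
    print(f"{a} / {b}: {method}{note}")
```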

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
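
A minimal pure-Python sketch of that calculation follows: rank both variables (ties receive the mean of their positions, which is a common convention and an assumption here), then divide the covariance of the ranks by the product of their standard deviations. The function names are illustrative and the data are made up.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)  # the 1/n factors cancel


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # nonlinear but strictly increasing in x

print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

Because y increases strictly with x, the ranks agree exactly and ρ is 1 up to floating-point rounding, while Pearson's r on the raw values stays below 1.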

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
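
The definition and the invariance property can both be checked with a short sketch. The function name `pearson_r` and the sample data are made up for illustration, and the rescaling uses positive scale factors (a negative factor would flip the sign of r).

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shift and rescale each variable separately; r should not change.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# Two exact lines with very different slopes both give r = 1 (up to rounding).
r_steep = pearson_r(x, [20 * a for a in x])
r_shallow = pearson_r(x, [0.1 * a for a in x])

print(r, r_transformed)
print(r_steep, r_shallow)
```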

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
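
The pair-counting definition translates directly into code. The sketch below implements exactly that simple form (often called τ-a); tie-corrected variants used by statistics libraries can give different values when the data contain ties. The function name and the data are illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts for neither side
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]

# 7 concordant and 3 discordant pairs out of 10.
print(kendall_tau(x, y))
```

The double loop over all pairs is quadratic in the number of observations, which is fine for a dataset of this size (462 rows).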

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
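
As a sketch of what a bias-corrected Cramér's V looks like, the function below applies the correction from Bergsma (2013) to a contingency table given as a list of rows. It is an illustration under stated assumptions, not pandas-profiling's implementation: the function name is hypothetical, the tables are made-up counts, and no continuity correction is applied to the χ² statistic.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of counts (list of rows)."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at 0.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Corrected numbers of rows and columns.
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Made-up tables: proportional rows (independence) and a perfect association.
independent = [[20, 30], [40, 60]]
perfect = [[50, 0], [0, 50]]

print(cramers_v_corrected(independent))
print(cramers_v_corrected(perfect))
```

Flooring the corrected φ² at zero is what keeps the estimate at 0 for tables that are consistent with independence, where the uncorrected estimator tends to be slightly positive.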

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion.

Sample

First rows

     names  sbp       tobacco   ldl       adiposity  famhist  typea  obesity   alcohol   age  chd
0    1      0.000039  2.701920  1.190736  23.11      Present  49     0.274632  6.238304  52   1
1    2      0.000048  0.158489  1.159962  28.61      Absent   55     0.260508  1.335202  63   1
2    3      0.000072  0.364113  1.132812  32.28      Present  52     0.259540  1.707536  46   0
3    4      0.000035  2.238847  1.204164  38.03      Present  51     0.250031  3.580604  58   1
4    5      0.000056  2.840636  1.133462  27.78      Present  60     0.271692  5.051066  49   1
5    6      0.000057  2.074707  1.205287  36.21      Present  62     0.253950  2.885226  45   0
6    7      0.000050  1.749774  1.129514  16.20      Absent   59     0.296955  1.470011  38   0
7    8      0.000077  1.754947  1.164612  14.60      Present  62     0.284761  2.142633  58   1
8    9      0.000077  0.000000  1.143720  19.40      Present  49     0.276566  1.440389  29   0
9    10     0.000057  0.000000  1.192183  30.96      Present  69     0.256163  0.000000  53   1

Last rows

     names  sbp       tobacco   ldl       adiposity  famhist  typea  obesity   alcohol   age  chd
452  454    0.000042  1.981938  1.123350  28.81      Present  61     0.271026  4.493005  42   0
453  455    0.000065  1.206835  1.218579  39.68      Present  36     0.251580  0.000000  51   1
454  456    0.000047  0.836512  1.170320  28.02      Absent   60     0.263303  2.323592  39   1
455  457    0.000061  1.380700  1.109631  26.48      Absent   48     0.280676  4.681496  27   1
456  458    0.000035  0.693145  1.151819  42.06      Present  56     0.246643  1.335202  57   0
457  459    0.000022  0.693145  1.195832  31.72      Absent   64     0.262040  0.000000  58   0
458  460    0.000030  1.775414  1.159962  32.10      Absent   52     0.261453  3.227917  52   1
459  461    0.000086  1.551846  1.047465  15.23      Absent   40     0.301167  3.717181  55   0
460  462    0.000072  1.963168  1.277860  30.79      Absent   64     0.266206  3.563422  40   0
461  463    0.000057  0.000000  1.170320  33.41      Present  62     0.341250  0.000000  46   1